Abstract: A challenging problem related to the design of polar
codes is "robustness against channel parameter variations", as
stated in Arıkan's original work. In this paper, we describe how
the problem of robust polar code design can be viewed as a
mismatched decoding problem. We propose conditions which ensure
that a polar encoder/decoder designed for a mismatched B-DMC can
be used to communicate reliably. In particular, the analysis shows
that the original polar code construction method is robust over
the class of binary symmetric channels.

Index Terms: Mismatched channels, channel polarization, polar codes.

I. Introduction 

In 2007, Arıkan [1] proposed polar codes as an appealing
error correction method based on a phenomenon called channel
polarization. This class of codes is proved to achieve the
symmetric capacity of any binary discrete memoryless channel
(B-DMC) using low complexity encoders and decoders, and
their block error probability is shown to decrease exponentially
in the square root of the blocklength [2].

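Two basic channel transformations lie at the heart of channel
polarization. Given a B-DMC $W : \mathcal{X} \to \mathcal{Y}$, the two successive
channels characterized by these transformations, $W^- : \mathcal{X} \to \mathcal{Y}^2$
and $W^+ : \mathcal{X} \to \mathcal{Y}^2 \times \mathcal{X}$, are defined by the following
transition probabilities:

$$W^-(y_1 y_2 \mid u_1) = \sum_{u_2 \in \mathcal{X}} \frac{1}{2}\, W(y_1 \mid u_1 \oplus u_2)\, W(y_2 \mid u_2), \tag{1}$$

$$W^+(y_1 y_2 u_1 \mid u_2) = \frac{1}{2}\, W(y_1 \mid u_1 \oplus u_2)\, W(y_2 \mid u_2). \tag{2}$$
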

For a blocklength $N = 2$, these channels would be indexed
as $W_2^{(1)}$ and $W_2^{(2)}$. In general $N = 2^n$, and the channels
$W_N^{(i)} : \mathcal{X} \to \mathcal{Y}^N \times \mathcal{X}^{i-1}$, for $i = 1, \ldots, N$, are synthesized by
the recursive application of these plus/minus transformations
until they are sufficiently polarized, i.e. they are either perfect or
completely noisy channels.
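
The two transformations in (1) and (2) can be sketched numerically. The snippet below is only an illustration under stated assumptions: a B-DMC is represented as a dictionary mapping each output symbol to the pair (W(y|0), W(y|1)), and the helper names `minus_transform`, `plus_transform` and `bhattacharyya` are hypothetical, not taken from the paper.

```python
from math import sqrt

def minus_transform(W):
    """W^-(y1 y2 | u1) = sum over u2 of 1/2 W(y1 | u1 xor u2) W(y2 | u2), cf. (1)."""
    out = {}
    for y1, p1 in W.items():
        for y2, p2 in W.items():
            out[(y1, y2)] = tuple(
                sum(0.5 * p1[u1 ^ u2] * p2[u2] for u2 in (0, 1))
                for u1 in (0, 1)
            )
    return out

def plus_transform(W):
    """W^+(y1 y2 u1 | u2) = 1/2 W(y1 | u1 xor u2) W(y2 | u2), cf. (2)."""
    out = {}
    for y1, p1 in W.items():
        for y2, p2 in W.items():
            for u1 in (0, 1):
                out[(y1, y2, u1)] = tuple(
                    0.5 * p1[u1 ^ u2] * p2[u2] for u2 in (0, 1)
                )
    return out

def bhattacharyya(W):
    """Matched Bhattacharyya parameter: sum over y of sqrt(W(y|0) W(y|1))."""
    return sum(sqrt(p0 * p1) for p0, p1 in W.values())

# Illustrative channel (an assumption): a BEC with erasure probability 1/2.
bec = {"0": (0.5, 0.0), "e": (0.5, 0.5), "1": (0.0, 0.5)}
print(bhattacharyya(minus_transform(bec)), bhattacharyya(plus_transform(bec)))
```

For a BEC the minus channel is noisier and the plus channel is cleaner than the original, which is the first step of the polarization effect described above.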

The polarization idea is used to propose polar codes, and
the recursive process leads to efficient encoding and decoding
structures. On the encoder side, uncoded data bits are sent
only through the perfect channels. For the rest, bits are
fixed beforehand and revealed to the decoder as well. On
the decoder side, the synthesized channels lend themselves
to a particular decoding procedure referred to as the successive
cancellation decoder (SCD). At the $i$-th stage, on the good
channels, the SCD estimates the channel input $u_i$ with law
$W_N^{(i)}(y_1^N u_1^{i-1} \mid u_i)$ according to the maximum likelihood (ML)
decision rule for the $i$-th channel using the previous estimates,
and supplies the new estimate $\hat{u}_i$ to the next stages. The
analysis carried out in [1] shows that this SCD performs with
vanishing error probability.

A particular aspect of polar codes is that they are channel
specific designs. The polarization process is adjusted to the
particular channel at hand, and so is the index set of the
synthesized good channels. This set, referred to as the information
set $\mathcal{A}$, is required by both the encoder and the decoder. The
situation in which this knowledge is partially missing has
already been addressed. Let $W$ and $V$ be two given B-DMCs.
The following two cases are known to lead to an ordering
$\mathcal{A}_V \subset \mathcal{A}_W$: $V$ is a binary erasure channel (BEC) with
larger Bhattacharyya parameter than the channel $W$, or $V$ is
a stochastically degraded version of $W$ [1]. These results allow
the designer to safely use the information set designed for the
channel $V$ for communication over $W$.

On the other hand, a critical point is the assumption of the
availability of the channel knowledge at the decoder. Indeed,
the described SCD requires not only the information set but
also the exact channel knowledge to function. Therefore, if
the true channel is unknown, the code design, including the
decoding rule, should be based on a mismatched channel.
In this work we assume the same SCD rule is kept, but
instead of the true channel law a different one is employed
in the decision procedure. We want to communicate reliably
over the channel $W$ using the polar code designed for the
mismatched channel $V$ (including the information set, encoder,
and decoder), achieving rates up to the symmetric capacity of
the mismatched channel $V$.

The article follows with the preliminaries section, then we 
explore the results in the subsequent section, and the final 
section briefly discusses the results. 

II. Preliminaries 

To assess the performance of mismatched polar codes, we
revisit expressions derived in [4] for the average probability of
error under SCD with respect to a mismatched channel. These
derivations closely follow their matched counterparts.

The SCD described in the introduction is closely tied to a
channel splitting operation. After channel combining, the splitting
synthesizes the channels whose transition probabilities are
given by:

$$W_N^{(i)}(y_1^N u_1^{i-1} \mid u_i) = \sum_{u_{i+1}^N} \frac{1}{2^{N-1}}\, W(y_1^N \mid x_1^N), \tag{3}$$

where $W(y_1^N \mid x_1^N) = \prod_{i=1}^{N} W(y_i \mid x_i)$ and $x_1^N$ denotes the channel input vector obtained by polar encoding of $u_1^N$.


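We define the likelihood ratio (LR) of a given B-DMC
$W$ as $L_W(y) = \dfrac{W(y \mid 1)}{W(y \mid 0)}$. Decision functions similar to the ML
decoding rule can then be defined as

$$d_W^{(i)}(y_1^N, \hat{u}_1^{i-1}) = \begin{cases} 0, & \text{if } L_{W_N^{(i)}}(y_1^N, \hat{u}_1^{i-1}) < 1 \\ 1, & \text{if } L_{W_N^{(i)}}(y_1^N, \hat{u}_1^{i-1}) > 1 \\ *, & \text{if } L_{W_N^{(i)}}(y_1^N, \hat{u}_1^{i-1}) = 1 \end{cases} \tag{4}$$

where $*$ is chosen from the set $\{0, 1\}$ by a fair coin flip.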

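The polar SCD decodes the received output in $N$ stages
using a chain of estimators for $i = 1, \ldots, N$, each depending
on the previous ones. The estimators are defined as

$$\hat{u}_i = \begin{cases} u_i, & \text{if } i \in \mathcal{A}^c \\ d_W^{(i)}(y_1^N, \hat{u}_1^{i-1}), & \text{if } i \in \mathcal{A} \end{cases} \tag{5}$$
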

Let Pe(W,V, A) denote the best achievable block error 
probability over the ensemble of all possible choices of the 
set A c when |„4 = L-^-^J under mismatched successive 
cancellation decoding with respect to the channel V when the 
true channel is W. Then, one can show that 



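$$\mathrm{Pe}(W, V, \mathcal{A}) \le \sum_{i \in \mathcal{A}} \mathrm{Pe}_N^{(i)}(W, V), \tag{6}$$

where $\mathrm{Pe}_N^{(i)}(W, V)$ is defined as

$$\mathrm{Pe}_N^{(i)}(W, V) = \sum_{u_1^N,\, y_1^N} \frac{1}{2^N}\, W(y_1^N \mid x_1^N) \left[ \mathbb{1}\{\Lambda_i > 1\} + \frac{1}{2}\mathbb{1}\{\Lambda_i = 1\} \right], \qquad \Lambda_i = \frac{V_N^{(i)}(y_1^N, u_1^{i-1} \mid u_i \oplus 1)}{V_N^{(i)}(y_1^N, u_1^{i-1} \mid u_i)}, \tag{7}$$

with $\mathbb{1}\{\cdot\}$ denoting the indicator function as usual.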

For channels symmetrized under the same permutation, the
next proposition can be proved using [1, Corollary 1].

Proposition 1: Let $W$ and $V$ be symmetric B-DMCs symmetrized
under the same permutation. Then,

$$\mathrm{Pe}_N^{(i)}(W, V) = \sum_{y_1^N} W(y_1^N \mid 0_1^N)\, H\!\left(L_{V_N^{(i)}}(y_1^N, 0_1^{i-1})\right), \tag{8}$$

where $H\!\left(L_{V_N^{(i)}}(y_1^N, 0_1^{i-1})\right)$ is defined as the following sum

$$\mathbb{1}\{L_{V_N^{(i)}}(y_1^N, 0_1^{i-1}) > 1\} + \frac{1}{2}\mathbb{1}\{L_{V_N^{(i)}}(y_1^N, 0_1^{i-1}) = 1\}. \tag{9}$$

As shorthand notation we will use $L_{V_N^{(i)}}(y_1^N) = L_{V_N^{(i)}}(y_1^N, 0_1^{i-1})$. The next proposition explores the recursive
structure of the LR computations.

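Proposition 2: [1] The LRs satisfy the recursion

$$L_{V_{2N}^{(2i-1)}}(y_1^{2N}) = \frac{L_{V_N^{(i)}}(y_1^N) + L_{V_N^{(i)}}(y_{N+1}^{2N})}{1 + L_{V_N^{(i)}}(y_1^N)\, L_{V_N^{(i)}}(y_{N+1}^{2N})}, \tag{10}$$

$$L_{V_{2N}^{(2i)}}(y_1^{2N}) = L_{V_N^{(i)}}(y_1^N)\, L_{V_N^{(i)}}(y_{N+1}^{2N}). \tag{11}$$

Hence, the computed LRs can be seen as symmetric functions
$f\!\left(L_{V_N^{(i)}}(y_1^N),\, L_{V_N^{(i)}}(y_{N+1}^{2N})\right)$ of the arguments.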
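
The recursion (10)-(11) can be illustrated with a short sketch. It assumes finite, strictly positive single-use LRs, all previous inputs fixed to zero, and the half-splitting of the output vector exactly as written in Proposition 2; the function names are hypothetical.

```python
from fractions import Fraction

def lr_minus(l1, l2):
    """Minus-transformation update, cf. (10)."""
    return (l1 + l2) / (1 + l1 * l2)

def lr_plus(l1, l2):
    """Plus-transformation update, cf. (11)."""
    return l1 * l2

def synthesized_lr(i, lrs):
    """LR of the i-th synthesized channel (1-based) given the list of
    single-use LRs, one per channel output; len(lrs) must be a power of two."""
    n = len(lrs)
    if n == 1:
        return lrs[0]
    half = n // 2
    parent = (i + 1) // 2          # index at the previous level
    a = synthesized_lr(parent, lrs[:half])
    b = synthesized_lr(parent, lrs[half:])
    # odd indices come from the minus transformation, even ones from the plus
    return lr_minus(a, b) if i % 2 == 1 else lr_plus(a, b)

lrs = [Fraction(1, 4), Fraction(4)]
print(synthesized_lr(1, lrs), synthesized_lr(2, lrs))
```

Both update rules are symmetric in their two arguments, which is the property used in the propositions that follow.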
We will use the following notation

$$P_W\!\left[L_{V_N^{(i)}}(y_1^N) > 1\right] = \sum_{y_1^N} W(y_1^N \mid 0_1^N)\, \mathbb{1}\{L_{V_N^{(i)}}(y_1^N) > 1\}. \tag{12}$$

Similar notation will hold for different sets considered within
the indicator function. We will also use $\mathbb{E}_W[\mathbb{1}\{\cdot\}] = P_W[\cdot]$
interchangeably.

Given two B-DMCs $W$ and $V$, we write $\mathbb{1}\{L_V(y) \ge 1\}_W \prec_{sd} \mathbb{1}\{L_V(y) \ge 1\}_V$
if the distribution of the random variable $\mathbb{1}\{L_V(y) \ge 1\}$ under
the distribution $W(y \mid 0)$ is stochastically dominated by its
distribution under $V(y \mid 0)$. For a definition of stochastic
dominance, see for instance [5, Chapter 1.2, Theorem B]. By
definition the condition implies that

$$\mathbb{E}_W\!\left[F(\mathbb{1}\{L_V(y) \ge 1\})\right] \le \mathbb{E}_V\!\left[F(\mathbb{1}\{L_V(y) \ge 1\})\right] \tag{13}$$

holds for any non-decreasing function $F(\cdot)$. As an example,
the cases where $W$ and $V$ are BSCs with crossover probabilities
$\epsilon_W \le \epsilon_V \le 0.5$ satisfy the order
$\mathbb{1}\{L_V(y) \ge 1\}_W \prec_{sd} \mathbb{1}\{L_V(y) \ge 1\}_V$. Similar notation will also be used
for the $\mathbb{1}\{L_V(y) \le 1\}$ random variable.

Upper Bounds to $\mathrm{Pe}_N^{(i)}(W, V)$: We give two channel
parameters which upper bound $\mathrm{Pe}_N^{(i)}(W, V)$ for symmetric
channels. The first one is simply $P_W\!\left[L_{V_N^{(i)}}(y_1^N) \ge 1\right]$ when
$W$ and $V$ are symmetrized under the same permutation.
The second parameter, analogous to the Bhattacharyya parameter
defined for the matched scenario and referred to
as the mismatched version of this quantity, is
$Z(W, V) = \sum_{y} W(y \mid 0)\sqrt{L_V(y)}$. Extending the definition to the $i$-th
synthesized channels, one can easily show that the bound
$\mathrm{Pe}_N^{(i)}(W, V) \le Z_N^{(i)}(W, V) = Z(W_N^{(i)}, V_N^{(i)})$ holds for
symmetric channels. Naturally, $\mathrm{Pe}(W, V)$ and $Z(W, V)$ will
denote the parameters when $N = 1$ and $i = 1$. For the matched
case, we will simply write $\mathrm{Pe}_N^{(i)}(W)$ and $Z_N^{(i)}(W)$.
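
For single-use channels ($N = 1$), both parameters are easy to evaluate. The following sketch uses floating point and a dictionary representation (output symbol mapped to the pair (W(y|0), W(y|1))); the function names are hypothetical, and the exact comparison with 1 in `h` is only reliable when the LRs are exactly representable.

```python
from math import inf, sqrt

def lr(V, y):
    """LR of output y under the design channel V: V(y|1) / V(y|0)."""
    p0, p1 = V[y]
    return inf if p0 == 0 else p1 / p0

def h(l):
    """H(L) = 1{L > 1} + 1/2 1{L = 1}, cf. (9)."""
    return 1.0 if l > 1 else 0.5 if l == 1 else 0.0

def pe_mismatched(W, V):
    """Pe(W, V): error probability over W when decisions use the LRs of V."""
    return sum(W[y][0] * h(lr(V, y)) for y in W if W[y][0] > 0)

def z_mismatched(W, V):
    """Z(W, V) = sum over y of W(y|0) sqrt(L_V(y))."""
    return sum(W[y][0] * sqrt(lr(V, y)) for y in W if W[y][0] > 0)

def bsc(eps):
    return {0: (1 - eps, eps), 1: (eps, 1 - eps)}

W, V = bsc(0.1), bsc(0.2)   # illustrative true and design channels
print(pe_mismatched(W, V), z_mismatched(W, V))
```

Since $H(L) \le \sqrt{L}$ for every $L \ge 0$, the first printed value can never exceed the second.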

III. Results 

The next theorem states the main result of this paper. 

Theorem 1: Let $W$ and $V$ be two B-DMCs symmetrized
under the same permutation which satisfy the following conditions:

(i) $P_V[L_V(y) < 1] \ge P_V[L_V(y) > 1]$,

(ii) $P_W[L_V(y) \ge 1] \le P_V[L_V(y) \ge 1]$,

(iii) $P_W[L_V(y) \le 1] \ge P_V[L_V(y) \le 1]$.

Then, for any given $N = 2^n$ with $n = 1, 2, \ldots$ and any given
$i = 1, \ldots, N$, we have $\mathrm{Pe}_N^{(i)}(W, V) \le Z_N^{(i)}(V)$. Moreover,
$\mathrm{Pe}_N^{(i)}(W, V) \le \mathrm{Pe}_N^{(i)}(V)$ holds for all $i \in \mathcal{A}$.

Theorem 1 will be proved using the following lemma and
the subsequent theorem.



Lemma 1: The process $P_W\!\left[L_{V_N^{(i)}}(y_1^N) = 1\right]$ is a bounded
submartingale in $[0, 1]$ which converges almost surely to the
values $\{0, 1\}$.

The proof of Lemma 1 is given in the Appendix.

Theorem 2: Let $W$ and $V$ be B-DMCs such that for a given
$N = 2^n$ with $n = 0, 1, 2, \ldots$ and a given $i = 1, \ldots, N$ the
following conditions hold:

A) $P_V\!\left[L_{V_N^{(i)}}(y_1^N) < 1\right] \ge P_V\!\left[L_{V_N^{(i)}}(y_1^N) > 1\right]$,

B) $P_W\!\left[L_{V_N^{(i)}}(y_1^N) \ge 1\right] \le P_V\!\left[L_{V_N^{(i)}}(y_1^N) \ge 1\right]$,

C) $P_W\!\left[L_{V_N^{(i)}}(y_1^N) \le 1\right] \ge P_V\!\left[L_{V_N^{(i)}}(y_1^N) \le 1\right]$.

Then, the basic polarization transformations preserve the above
three conditions in the sense that, at the next level, they hold
for the $2i$-th and $(2i-1)$-th indices.

An entire section will be devoted to the proof of Theorem 2
after we prove Theorem 1.

Proof of Theorem 1: Assume the conditions (i), (ii), and
(iii) hold. Then by Theorem 2 the conditions are preserved
for the synthetic channels created by the polar transformations.
Hence, for all $i = 1, \ldots, N$, we get

$$P_W\!\left[L_{V_N^{(i)}}(y_1^N) \ge 1\right] \le P_V\!\left[L_{V_N^{(i)}}(y_1^N) \ge 1\right]. \tag{14}$$

Knowing that the bounds $\mathrm{Pe}_N^{(i)}(W, V) \le P_W\!\left[L_{V_N^{(i)}}(y_1^N) \ge 1\right]$
(as we assumed $W$ and $V$ are symmetrized under the same
permutation) and $P_V\!\left[L_{V_N^{(i)}}(y_1^N) \ge 1\right] \le Z(V_N^{(i)})$ apply, the
relation $\mathrm{Pe}_N^{(i)}(W, V) \le Z(V_N^{(i)})$ is proved.

On the other hand, Lemma 1 shows that once channels
are sufficiently polarized, either $P_W\!\left[L_{V_N^{(i)}}(y_1^N) = 1\right] = 1$
or $P_W\!\left[L_{V_N^{(i)}}(y_1^N) = 1\right] = 0$. Moreover, one can easily find that the
first case leads to a completely noisy channel, and only
the second case can lead to a perfect channel under a possibly
mismatched decoding. As the inequalities

$$\mathrm{Pe}_N^{(i)}(W, V) \le P_W\!\left[L_{V_N^{(i)}}(y_1^N) \ge 1\right] \le P_V\!\left[L_{V_N^{(i)}}(y_1^N) \ge 1\right] \tag{15}$$

hold, it turns out that, for those indices $i \in \mathcal{A}$ which
correspond to the good channels picked by the polar code
designed for the channel $V$, so that $P_V\!\left[L_{V_N^{(i)}}(y_1^N) = 1\right] = 0$,
we have

$$\mathrm{Pe}_N^{(i)}(W, V) \le \mathrm{Pe}_N^{(i)}(V), \quad \forall i \in \mathcal{A}, \tag{16}$$

as claimed. This completes the proof of the theorem. ■
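
The preservation argument can be explored numerically by tracking the distribution of the LR under the true and the design channel through the recursion of Proposition 2. The sketch below is an illustration only, assuming BSCs with a design crossover probability strictly below 1/2, exact rational arithmetic, and hypothetical helper names; it prints both sides of (14) for every index so that the comparison can be inspected.

```python
from collections import defaultdict
from fractions import Fraction as F

def minus(a, b):
    return (a + b) / (1 + a * b)

def plus(a, b):
    return a * b

def combine(dist, f):
    """Distribution of f(L1, L2) for L1, L2 i.i.d. with distribution dist."""
    out = defaultdict(F)
    for l1, p1 in dist.items():
        for l2, p2 in dist.items():
            out[f(l1, l2)] += p1 * p2
    return dict(out)

def synthesized_dists(dist, n):
    """LR distributions of the 2**n synthesized channels, in index order."""
    dists = [dist]
    for _ in range(n):
        dists = [combine(d, f) for d in dists for f in (minus, plus)]
    return dists

def prob_ge1(dist):
    return sum(p for l, p in dist.items() if l >= 1)

def lr_dist_bsc(eps_true, eps_design):
    """Distribution of L_V(y) when the all-zero input is sent over BSC(eps_true)
    and V = BSC(eps_design); assumes eps_design < 1/2."""
    low = eps_design / (1 - eps_design)
    return {low: 1 - eps_true, 1 / low: eps_true}

under_w = synthesized_dists(lr_dist_bsc(F(1, 10), F(1, 5)), 3)
under_v = synthesized_dists(lr_dist_bsc(F(1, 5), F(1, 5)), 3)
for i, (dw, dv) in enumerate(zip(under_w, under_v), start=1):
    print(i, prob_ge1(dw), prob_ge1(dv), prob_ge1(dw) <= prob_ge1(dv))
```

The number of distinct LR values grows quickly with the number of levels, so this brute-force enumeration is only practical for small blocklengths.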
A. Proof of Theorem 2

We first introduce a set of propositions needed in the proof.
Proposition 3: For a symmetric B-DMC $V$ such
that the condition

$$P_V\!\left[L_{V_N^{(i)}}(y_1^N) < 1\right] \ge P_V\!\left[L_{V_N^{(i)}}(y_1^N) > 1\right] \tag{17}$$

holds for a given $N = 2^n$ with $n = 0, 1, \ldots$ and for a given
$i = 1, \ldots, N$, the basic polarization transformations preserve
the inequality, i.e. for $j = 2i - 1, 2i$, we have

$$P_V\!\left[L_{V_{2N}^{(j)}}(y_1^{2N}) < 1\right] \ge P_V\!\left[L_{V_{2N}^{(j)}}(y_1^{2N}) > 1\right]. \tag{18}$$

The proof of Proposition 3 is given in the Appendix.
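Proposition 4: For B-DMCs $W$ and $V$, and $j \in \{2i-1, 2i\}$, we have

$$P_W\!\left[L_{V_{2N}^{(j)}}(y_1^{2N}) \ge 1\right] - P_V\!\left[L_{V_{2N}^{(j)}}(y_1^{2N}) \ge 1\right] = \sum_{y_1^N} \left[W(y_1^N \mid 0_1^N) - V(y_1^N \mid 0_1^N)\right] \times \sum_{y_{N+1}^{2N}} \left[W(y_{N+1}^{2N} \mid 0_1^N) + V(y_{N+1}^{2N} \mid 0_1^N)\right] \mathbb{1}\{L_{V_{2N}^{(j)}}(y_1^{2N}) \ge 1\}. \tag{19}$$

Proof of Proposition 4: We develop the right hand side
of Equation (19):

$$\sum_{y_1^{2N}} \left[W(y_1^{2N} \mid 0_1^{2N}) - V(y_1^{2N} \mid 0_1^{2N})\right] \mathbb{1}\{L_{V_{2N}^{(j)}}(y_1^{2N}) \ge 1\}$$
$$+ \sum_{y_1^{2N}} W(y_1^N \mid 0_1^N)\, V(y_{N+1}^{2N} \mid 0_1^N)\, \mathbb{1}\{f(L_{V_N^{(i)}}(y_1^N), L_{V_N^{(i)}}(y_{N+1}^{2N})) \ge 1\}$$
$$- \sum_{y_1^{2N}} W(y_{N+1}^{2N} \mid 0_1^N)\, V(y_1^N \mid 0_1^N)\, \mathbb{1}\{f(L_{V_N^{(i)}}(y_1^N), L_{V_N^{(i)}}(y_{N+1}^{2N})) \ge 1\}$$
$$= P_W\!\left[L_{V_{2N}^{(j)}}(y_1^{2N}) \ge 1\right] - P_V\!\left[L_{V_{2N}^{(j)}}(y_1^{2N}) \ge 1\right], \tag{20}$$

where the last two sums cancel by the symmetry of the LR functions described
in Proposition 2. ■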
Proposition 5: For any B-DMCs $W$ and $V$, we have

$$\mathbb{1}\{L_V(y) \ge 1\}_W \prec_{sd} \mathbb{1}\{L_V(y) \ge 1\}_V \quad \text{iff} \quad P_W[L_V(y) \ge 1] \le P_V[L_V(y) \ge 1], \tag{21}$$

$$\mathbb{1}\{L_V(y) \le 1\}_W \succ_{sd} \mathbb{1}\{L_V(y) \le 1\}_V \quad \text{iff} \quad P_W[L_V(y) \le 1] \ge P_V[L_V(y) \le 1]. \tag{22}$$

Proof of Proposition 5: The proposition follows by
noting that the random variables defined by the indicator functions are
binary valued, so in both cases the two conditions are
equivalent. ■

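Proposition 6: Write $L_1 = L_{V_N^{(i)}}(y_1^N)$ and $L_2 = L_{V_N^{(i)}}(y_{N+1}^{2N})$. The function
$\mathbb{E}_W\!\left[\mathbb{1}\{L_{V_{2N}^{(2i)}}(y_1^{2N}) \ge 1\} \mid \mathbb{1}\{L_1 \ge 1\}\right]$
$\left(\mathbb{E}_W\!\left[\mathbb{1}\{L_{V_{2N}^{(2i)}}(y_1^{2N}) \le 1\} \mid \mathbb{1}\{L_1 \le 1\}\right]\right)$ is
non-decreasing in $\mathbb{1}\{L_1 \ge 1\}$ $\left(\mathbb{1}\{L_1 \le 1\}\right)$.
The function $\mathbb{E}_W\!\left[\mathbb{1}\{L_{V_{2N}^{(2i-1)}}(y_1^{2N}) \ge 1\} \mid \mathbb{1}\{L_1 \ge 1\}\right]$
$\left(\mathbb{E}_W\!\left[\mathbb{1}\{L_{V_{2N}^{(2i-1)}}(y_1^{2N}) \le 1\} \mid \mathbb{1}\{L_1 \le 1\}\right]\right)$, however, is
non-decreasing in $\mathbb{1}\{L_1 \ge 1\}$ $\left(\mathbb{1}\{L_1 \le 1\}\right)$ if
the following condition holds:

$$P_W[L_2 < 1] \ge P_W[L_2 > 1]. \tag{23}$$

Proof of Proposition 6: The claims for the plus operations
are trivial. For the minus operation, the claims follow by
noting that

$$\mathbb{E}_W\!\left[\mathbb{1}\{L_{V_{2N}^{(2i-1)}}(y_1^{2N}) \ge 1\} \mid \mathbb{1}\{L_1 \ge 1\} = 0\right] = \mathbb{E}_W\!\left[\mathbb{1}\{L_{V_{2N}^{(2i-1)}}(y_1^{2N}) \le 1\} \mid \mathbb{1}\{L_1 \le 1\} = 0\right] = P_W[L_2 \ge 1], \tag{24}$$

and both

$$\mathbb{E}_W\!\left[\mathbb{1}\{L_{V_{2N}^{(2i-1)}}(y_1^{2N}) \ge 1\} \mid \mathbb{1}\{L_1 \ge 1\} = 1\right] \ge P_W[L_2 \le 1], \tag{25}$$

$$\mathbb{E}_W\!\left[\mathbb{1}\{L_{V_{2N}^{(2i-1)}}(y_1^{2N}) \le 1\} \mid \mathbb{1}\{L_1 \le 1\} = 1\right] \ge P_W[L_2 \le 1]. \tag{26}$$

So by symmetry of $y_1^N$ and $y_{N+1}^{2N}$ in the construction, the
condition in (23) is sufficient to prove the monotonicity claims. ■

Proof of Theorem 2:

A±) We know condition A is preserved by Proposition 3.

B±) Using Proposition 4, we get, for $j \in \{2i-1, 2i\}$,

$$P_W\!\left[L_{V_{2N}^{(j)}}(y_1^{2N}) \ge 1\right] - P_V\!\left[L_{V_{2N}^{(j)}}(y_1^{2N}) \ge 1\right] = \sum_{y_1^N} \left[W(y_1^N \mid 0_1^N) - V(y_1^N \mid 0_1^N)\right] \times \mathbb{E}_{W+V}\!\left[\mathbb{1}\{L_{V_{2N}^{(j)}}(y_1^{2N}) \ge 1\} \mid \mathbb{1}\{L_{V_N^{(i)}}(y_1^N) \ge 1\}\right], \tag{27}$$

where we have defined

$$\mathbb{E}_{W+V}\!\left[\mathbb{1}\{L_{V_{2N}^{(j)}}(y_1^{2N}) \ge 1\} \mid \mathbb{1}\{L_{V_N^{(i)}}(y_1^N) \ge 1\}\right] = \sum_{y_{N+1}^{2N}} \left[W(y_{N+1}^{2N} \mid 0_1^N) + V(y_{N+1}^{2N} \mid 0_1^N)\right] \mathbb{1}\{L_{V_{2N}^{(j)}}(y_1^{2N}) \ge 1\}. \tag{28}$$

Moreover, by Proposition 5, condition B implies that
$\mathbb{1}\{L_{V_N^{(i)}}(y_1^N) \ge 1\}_W \prec_{sd} \mathbb{1}\{L_{V_N^{(i)}}(y_1^N) \ge 1\}_V$. So,
we will be done if we show that the random variables
defined in (28), obtained after applying the polar
transformations, are both non-decreasing transformations
in $\mathbb{1}\{L_{V_N^{(i)}}(y_1^N) \ge 1\}$. We consider the cases where the
expectations are taken under $W$ and $V$ separately. For
$\mathbb{E}_V\!\left[\mathbb{1}\{L_{V_{2N}^{(j)}}(y_1^{2N}) \ge 1\} \mid \mathbb{1}\{L_{V_N^{(i)}}(y_1^N) \ge 1\}\right]$,
we know by taking $W = V$ in Proposition 6 and by
condition A that this claim holds. For
$\mathbb{E}_W\!\left[\mathbb{1}\{L_{V_{2N}^{(j)}}(y_1^{2N}) \ge 1\} \mid \mathbb{1}\{L_{V_N^{(i)}}(y_1^N) \ge 1\}\right]$,
we know once again by Proposition 6 that this is always
true for the plus transformation and is also true for the
minus transformation if we have

$$P_W\!\left[L_{V_N^{(i)}}(y_1^N) < 1\right] \ge P_W\!\left[L_{V_N^{(i)}}(y_1^N) > 1\right]. \tag{29}$$

Now we show that (29) holds. Taking the difference of
the inequalities stated in conditions B and C, we get

$$P_W\!\left[L_{V_N^{(i)}}(y_1^N) < 1\right] - P_W\!\left[L_{V_N^{(i)}}(y_1^N) > 1\right] \ge P_V\!\left[L_{V_N^{(i)}}(y_1^N) < 1\right] - P_V\!\left[L_{V_N^{(i)}}(y_1^N) > 1\right] \ge 0, \tag{30}$$

where the non-negativity follows by condition A.

C±) The proof can be carried out following similar steps as
in part B±, by showing that the transformations defined
by $\mathbb{E}_{W+V}\!\left[\mathbb{1}\{L_{V_{2N}^{(j)}}(y_1^{2N}) \le 1\} \mid \mathbb{1}\{L_{V_N^{(i)}}(y_1^N) \le 1\}\right]$ are
also non-decreasing in $\mathbb{1}\{L_{V_N^{(i)}}(y_1^N) \le 1\}$ using Proposition 6,
condition A, and Equation (29). ■

It is useful to remark that for those B-DMCs $W$ and $V$ such
that no output has an LR equal to one, the assumptions
(ii) and (iii) of Theorem 1 can be merged into a single initial
condition, $\mathrm{Pe}(W, V) \le \mathrm{Pe}(V)$. Following this remark, we
now study in Theorem 3 the one step preservation properties
related to the channel parameter $\mathrm{Pe}_N^{(i)}$.

Theorem 3: Let $W$ and $V$ be B-DMCs symmetrized under
the same permutation such that for a given $N = 2^n$ with
$n = 0, 1, 2, \ldots$ and a given $i = 1, \ldots, N$ the following conditions
hold:

A) $P_V\!\left[L_{V_N^{(i)}}(y_1^N) < 1\right] \ge P_V\!\left[L_{V_N^{(i)}}(y_1^N) > 1\right]$,

B) $\mathrm{Pe}_N^{(i)}(W, V) - \mathrm{Pe}_N^{(i)}(V) \le 0$.

Then, the minus polar transformation preserves these conditions.
On the other hand, while the plus transformation
preserves condition A, condition B may not be preserved in
general.

B. Proof of Theorem 3

We first introduce two propositions needed in the proof. The
proofs of the propositions are given in the Appendix.

Proposition 7: The quantities $\mathrm{Pe}_{2N}^{(j)}(W, V) - \mathrm{Pe}_{2N}^{(j)}(V)$, $j \in \{2i-1, 2i\}$, can
be recursively computed as

$$\mathrm{Pe}_{2N}^{(2i-1)}(W, V) - \mathrm{Pe}_{2N}^{(2i-1)}(V) = \sum_{y_1^N} \left[W(y_1^N \mid 0_1^N) - V(y_1^N \mid 0_1^N)\right] H\!\left(L_{V_N^{(i)}}(y_1^N)\right) K_N, \tag{31}$$

where

$$K_N = \sum_{\substack{y_{N+1}^{2N}: \\ L_{V_N^{(i)}}(y_{N+1}^{2N}) < 1}} \left[W(y_{N+1}^{2N} \mid 0_1^N) + V(y_{N+1}^{2N} \mid 0_1^N)\right] - \sum_{\substack{y_{N+1}^{2N}: \\ L_{V_N^{(i)}}(y_{N+1}^{2N}) > 1}} \left[W(y_{N+1}^{2N} \mid 0_1^N) + V(y_{N+1}^{2N} \mid 0_1^N)\right], \tag{32}$$

and

$$\mathrm{Pe}_{2N}^{(2i)}(W, V) - \mathrm{Pe}_{2N}^{(2i)}(V) = \sum_{y_1^N} \left[W(y_1^N \mid 0_1^N) - V(y_1^N \mid 0_1^N)\right] \times \sum_{y_{N+1}^{2N}} \left[W(y_{N+1}^{2N} \mid 0_1^N) + V(y_{N+1}^{2N} \mid 0_1^N)\right] H\!\left(L_{V_{2N}^{(2i)}}(y_1^{2N})\right). \tag{33}$$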



Proposition 8: Assume $W$ and $V$ are B-DMCs such that the
conditions A and B of Theorem 3 hold for a given $N = 2^n$
with $n = 0, 1, 2, \ldots$ and a given $i = 1, \ldots, N$. Then,

$$P_W\!\left[L_{V_N^{(i)}}(y_1^N) < 1\right] \ge P_W\!\left[L_{V_N^{(i)}}(y_1^N) > 1\right]. \tag{34}$$



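Proof of Theorem 3:

A±) We know condition A is preserved by Proposition 3.

B−) For the minus transformation, we have by Proposition 7

$$\mathrm{Pe}_{2N}^{(2i-1)}(W, V) - \mathrm{Pe}_{2N}^{(2i-1)}(V) = \left(\mathrm{Pe}_N^{(i)}(W, V) - \mathrm{Pe}_N^{(i)}(V)\right) K_N. \tag{35}$$

Now, we claim that $K_N \ge 0$, from which
$\mathrm{Pe}_{2N}^{(2i-1)}(W, V) - \mathrm{Pe}_{2N}^{(2i-1)}(V) \le 0$ follows. To prove
the claim, note that by Equation (32) the constant $K_N$
equals

$$P_W\!\left[L_{V_N^{(i)}}(y_{N+1}^{2N}) < 1\right] - P_W\!\left[L_{V_N^{(i)}}(y_{N+1}^{2N}) > 1\right] + P_V\!\left[L_{V_N^{(i)}}(y_{N+1}^{2N}) < 1\right] - P_V\!\left[L_{V_N^{(i)}}(y_{N+1}^{2N}) > 1\right]. \tag{36}$$

Then, the non-negativity of $K_N$ follows by condition A
and Proposition 8, which shows that the conditions A and B
imply

$$P_W\!\left[L_{V_N^{(i)}}(y_1^N) < 1\right] \ge P_W\!\left[L_{V_N^{(i)}}(y_1^N) > 1\right]. \tag{37}$$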



B+) We give a counterexample: Let $W$ be a BSC of crossover
probability 0.3 and $V$ a symmetric B-DMC with $\mathcal{Y} = \{0, e, 1\}$
such that the LRs take the values $\{1/4, 1, 4\}$
with probabilities $V(y \mid 0) = \{0.4, 0.5, 0.1\}$, respectively.
One can check that although conditions A and B are
satisfied for $N = 1$ and $i = 1$, condition B fails to hold
after the plus transformation for $N = 2$ and $i = 2$. ■
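
The counterexample can be checked with exact arithmetic. The sketch below makes the following assumptions: $W$ is viewed as a channel on the same output alphabet that never produces the erasure symbol, the all-zero input is sent, and the LR distributions are pushed through the updates of Proposition 2; the helper names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction as F

def combine(dist, f):
    """Distribution of f(L1, L2) for L1, L2 i.i.d. with distribution dist."""
    out = defaultdict(F)
    for l1, p1 in dist.items():
        for l2, p2 in dist.items():
            out[f(l1, l2)] += p1 * p2
    return dict(out)

def pe(dist):
    """P[L > 1] + 1/2 P[L = 1], i.e. the expectation of H(L)."""
    return sum(p * (1 if l > 1 else F(1, 2) if l == 1 else 0)
               for l, p in dist.items())

def gap(dist):
    """P[L < 1] - P[L > 1]."""
    return (sum(p for l, p in dist.items() if l < 1)
            - sum(p for l, p in dist.items() if l > 1))

minus = lambda a, b: (a + b) / (1 + a * b)
plus = lambda a, b: a * b

# Distribution of L_V(y) under W = BSC(0.3) and under V itself.
under_w = {F(1, 4): F(7, 10), F(4): F(3, 10)}
under_v = {F(1, 4): F(2, 5), F(1): F(1, 2), F(4): F(1, 10)}

print("A:", gap(under_v) >= 0, " B:", pe(under_w) - pe(under_v) <= 0)
for name, f in (("minus", minus), ("plus", plus)):
    diff = pe(combine(under_w, f)) - pe(combine(under_v, f))
    print(name, diff, diff <= 0)
# minus -7/200 True
# plus 1/40 False
```

The minus-transformation difference also equals $(\mathrm{Pe}(W,V) - \mathrm{Pe}(V))\,K_1$ with $K_1 = 7/10$ computed as in (36), which agrees with (35).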



We saw in Theorem 3 that we need to impose some more
constraints on the mismatched channel to be used if we want to
ensure condition B is preserved under both transformations.

Consider the mismatched Bhattacharyya parameter we defined
as $Z(W, V) = \sum_{y} W(y \mid 0)\sqrt{L_V(y)}$. After applying the
plus polar transformation we get $Z^+(W, V) = Z(W, V)^2$,
as in the matched case shown in [1]. Therefore, we have


Z$(W, V) - Z$(V) < Z%>(W, V) - Z^>(V) < 0. 

(38) 

In the next theorem, we explore the possible connection of
such a result with Theorem 3.

Theorem 4: Assume the channels $W$ and $V$ described in the
hypothesis of Theorem 3 also satisfy the following condition
for any $N = 2^n$ with $n = 1, 2, \ldots$ and for any $i = 1, \ldots, N$:

$$\mathrm{Pe}_N^{(i)}(W, V) - \mathrm{Pe}_N^{(i)}(V) \le 0 \quad \text{iff} \quad Z_N^{(i)}(W, V) - Z_N^{(i)}(V) \le 0. \tag{39}$$

Then, the condition B of Theorem 3 is preserved under both
polar transformations.

The theorem statement simply says that if the Bhattacharyya
upper bounds follow the same behavior as their $\mathrm{Pe}_N^{(i)}$ counterparts,
which can occur for instance if they are sufficiently
tight for both the matched and mismatched error probabilities
at any level, then as long as we design the polar code for
a mismatched channel $V$ such that $\mathrm{Pe}(W, V) \le \mathrm{Pe}(V)$ is
satisfied, we are safe to use the code over the channel $W$.
Although Theorem 4 provides a partial solution to the design
problem, unfortunately it is non-constructive at this stage. We
would need to study which channels could satisfy these types
of constraints.

IV. Discussions 

We took a designer's perspective to analyze the performance
of mismatched polar codes, and we identified in Theorem 1
conditions under which the polar code designed using Arıkan's
original construction method [1] for a given B-DMC can be
used reliably for a mismatched channel. Are these conditions
(i), (ii), and (iii) given in Theorem 1 terrestrial? We give
a positive answer by showing that the set of BSCs of crossover
probabilities $\epsilon_W \le \epsilon_V \le 0.5$ satisfies the three conditions: (i)
is equivalent to $1 - \epsilon_V \ge \epsilon_V$, (ii) is equivalent to $\epsilon_W \le \epsilon_V$,
and (iii) to $1 - \epsilon_W \ge 1 - \epsilon_V$. As illustrated in this specific
example, the conditions are rather natural ones, and perhaps
they even hold for other specific classes of channels.
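
The three conditions are simple to test for any pair of single-use channels. The following sketch assumes the dictionary representation used for illustration here (output symbol mapped to the pair (P(y|0), P(y|1))), exact rational inputs, and a hypothetical function name; LR comparisons are done by cross-comparison so that no division is needed.

```python
from fractions import Fraction as F

def bsc(eps):
    return {0: (1 - eps, eps), 1: (eps, 1 - eps)}

def theorem1_conditions(W, V):
    """Return the truth values of conditions (i), (ii), (iii) of Theorem 1,
    with all LRs computed from the design channel V."""
    def prob(P, pred):
        # probability, under P(. | 0), of the outputs selected by pred
        return sum(P[y][0] for y in V if pred(V[y][1], V[y][0]))
    lt = lambda p1, p0: p1 < p0    # L_V(y) < 1
    gt = lambda p1, p0: p1 > p0    # L_V(y) > 1
    le = lambda p1, p0: p1 <= p0   # L_V(y) <= 1
    ge = lambda p1, p0: p1 >= p0   # L_V(y) >= 1
    return (
        prob(V, lt) >= prob(V, gt),
        prob(W, ge) <= prob(V, ge),
        prob(W, le) >= prob(V, le),
    )

print(theorem1_conditions(bsc(F(1, 10)), bsc(F(1, 5))))  # design for the worse BSC
print(theorem1_conditions(bsc(F(1, 5)), bsc(F(1, 10))))  # design for the better BSC
```

Designing for the worse of the two BSCs satisfies all three conditions, whereas designing for the better one violates (ii) and (iii).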

The robustness of polar codes over BSCs has also been
discussed previously. Theorem 1 of that earlier work shows that
replacing the minus polar transformation with a specific
approximation results in the LRs of the synthesized channels
$W_N^{(i)}$ and $V_N^{(i)}$ being ordered for each $i = 1, \ldots, N$ as

$$1 \le \tilde{L}_{V_N^{(i)}}(y_1^N, u_1^{i-1}) \le \tilde{L}_{W_N^{(i)}}(y_1^N, u_1^{i-1}), \tag{40}$$

$$\text{or} \quad \tilde{L}_{W_N^{(i)}}(y_1^N, u_1^{i-1}) \le \tilde{L}_{V_N^{(i)}}(y_1^N, u_1^{i-1}) \le 1, \tag{41}$$

where the symbol $\sim$ indicates that the computations use the approximation.
So, the decoder estimate for a given output realization
will be identical whether the computations are performed
with respect to the approximated LRs of the channel $W$
or the channel $V$. In this case, for any $i = 1, \ldots, N$,
$\mathrm{Pe}_N^{(i)}(W, V) = \mathrm{Pe}_N^{(i)}(W)$ holds as well. Although the decoder
is completely robust, no theoretical analysis is provided to
argue what rates can ultimately be achieved by a successive
cancellation decoder using the approximate computations. On
the other hand, here, a consequence of Theorem 1 is that the
compound capacity [6] of the set of BSCs, i.e. the capacity
of the worst BSC in the set, is achievable by the polar code
designed for this worst channel.

VI. Appendix

In this Appendix we prove Lemma 1 and Propositions 3, 7, and 8.

Proof of Lemma 1: The boundedness claim is trivial. Let
$L_1 = L_{V_N^{(i)}}(y_1^N)$ and $L_2 = L_{V_N^{(i)}}(y_{N+1}^{2N})$ for simplicity. We
first note that

$$P_W\!\left[\frac{L_1 + L_2}{1 + L_1 L_2} = 1\right] = 2 P_W[L = 1] - P_W[L = 1]^2, \tag{42}$$

$$P_W[L_1 L_2 = 1] \ge P_W[L = 1]^2, \tag{43}$$

where we used the fact that $P_W[L = 1] = P_W[L_1 = 1] = P_W[L_2 = 1]$. Therefore,

$$\frac{1}{2} P_W\!\left[\frac{L_1 + L_2}{1 + L_1 L_2} = 1\right] + \frac{1}{2} P_W[L_1 L_2 = 1] \ge P_W[L = 1]. \tag{44}$$

This inequality proves the process is a submartingale. By
general results on bounded martingales, we know the process
converges almost surely. One can complete the proof that
the convergence is to the extremes in a similar fashion to the
proof carried out in [1, Proposition 9] of the convergence to the
extremes of the Bhattacharyya parameters' process associated
to the polarization transformations. By using (42), we have

$$\mathbb{E}_{\pm}\!\left[\left| P_W[L^{\pm} = 1] - P_W[L = 1] \right|\right] \ge \frac{1}{2} P_W[L = 1]\left(1 - P_W[L = 1]\right), \tag{45}$$

and when the left side of this inequality goes to zero, $\{0, 1\}$
are the only possible values $P_W[L = 1]$ can take. ■

Proof of Proposition 3: For simplicity we define
$L_{V_N^{(i)}}(y_1^N) = L_1$, $L_{V_N^{(i)}}(y_{N+1}^{2N}) = L_2$, and omit the subscript
in $P_V$. Note that by symmetry in the construction of polar
codes $P[L_1 < 1] = P[L_2 < 1]$.

For the plus transformation, we use a property following
from the symmetry of the channels:

$$W(y \mid 0) = \frac{W(y \mid 1)}{L(y)} \;\Rightarrow\; P[L(y) = \ell] = \frac{1}{\ell} P\!\left[L(y) = \frac{1}{\ell}\right]. \tag{46}$$

We define the following notations

$$P[L_1 \gtrsim 1] = P[L_1 > 1] + \frac{1}{2} P[L_1 = 1], \tag{47}$$

$$P[L_1 \lesssim 1] = P[L_1 < 1] + \frac{1}{2} P[L_1 = 1]. \tag{48}$$

Then, counting the pairs with $\ell_1 \ell_2 = 1$ with weight $1/2$,

$$P[L_1 L_2 \lesssim 1] = \sum_{\ell_1 \le 1} \sum_{\ell_2 \le 1} P[L_1 = \ell_1] P[L_2 = \ell_2] - \frac{1}{2} P[L_1 = 1]^2 + \sum_{\ell_1 < 1} \sum_{1 < \ell_2 < 1/\ell_1} P[L_1 = \ell_1] P[L_2 = \ell_2] + \sum_{\ell_1 > 1} \sum_{\ell_2 < 1/\ell_1} P[L_1 = \ell_1] P[L_2 = \ell_2] + \frac{1}{2} \sum_{\ell_1 \ne 1} P[L_1 = \ell_1] P[L_2 = 1/\ell_1] \tag{49}$$

$$= P[L_1 \le 1]^2 - \frac{1}{2} P[L_1 = 1]^2 + \sum_{\ell_1 < 1} \sum_{1 < \ell_2 < 1/\ell_1} P[L_1 = \ell_1] P[L_2 = \ell_2] + \sum_{\ell_1 > 1} \sum_{\ell_2 < 1/\ell_1} P[L_1 = \ell_1] P[L_2 = \ell_2] + \frac{1}{2} \sum_{\ell_1 \ne 1} P[L_1 = \ell_1] P[L_2 = 1/\ell_1] \tag{50}$$

$$= P[L_1 \le 1]^2 - \frac{1}{2} P[L_1 = 1]^2 + \sum_{\ell_1 > 1} \sum_{\ell_2 > 1} P[L_1 = \ell_1] P[L_2 = \ell_2] \max\{\ell_1, \ell_2\} \tag{51}$$

$$= P[L_1 \lesssim 1]^2 + \frac{1}{4} P[L_1 = 1]^2 + P[L_1 = 1] P[L_1 < 1] + \sum_{\ell_1 > 1} \sum_{\ell_2 > 1} P[L_1 = \ell_1] P[L_2 = \ell_2] \max\{\ell_1, \ell_2\} \tag{52}$$

$$= P[L_1 \lesssim 1]^2 + \sum_{\ell_1 \gtrsim 1} \sum_{\ell_2 \gtrsim 1} P[L_1 = \ell_1] P[L_2 = \ell_2] \max\{\ell_1, \ell_2\}, \tag{53}$$

where (51) follows from (46), and where we abuse the notation to define (note the $\gtrsim$ sign in the
summation index)

$$\sum_{\ell_1 \gtrsim 1} \sum_{\ell_2 \gtrsim 1} P[L_1 = \ell_1] P[L_2 = \ell_2] \max\{\ell_1, \ell_2\} = \sum_{\ell_1 > 1} \sum_{\ell_2 > 1} P[L_1 = \ell_1] P[L_2 = \ell_2] \max\{\ell_1, \ell_2\} + P[L_1 = 1] \sum_{\ell_2 > 1} P[L_2 = \ell_2]\, \ell_2 + \frac{1}{4} P[L_1 = 1]^2. \tag{54}$$

In the same spirit, we define

$$\sum_{\ell_1 \gtrsim 1} \sum_{\ell_2 \gtrsim 1} P[L_1 = \ell_1] P[L_2 = \ell_2] \min\{\ell_1, \ell_2\} = \sum_{\ell_1 > 1} \sum_{\ell_2 > 1} P[L_1 = \ell_1] P[L_2 = \ell_2] \min\{\ell_1, \ell_2\} + P[L_1 = 1] P[L_1 > 1] + \frac{1}{4} P[L_1 = 1]^2, \tag{55}$$

and we note that

$$\sum_{\ell_1 \gtrsim 1} \sum_{\ell_2 \gtrsim 1} P[L_1 = \ell_1] P[L_2 = \ell_2] \left(\max\{\ell_1, \ell_2\} + \min\{\ell_1, \ell_2\}\right) = \sum_{\ell_1 \gtrsim 1} \sum_{\ell_2 \gtrsim 1} P[L_1 = \ell_1] P[L_2 = \ell_2] (\ell_1 + \ell_2) \tag{56}$$

$$= 2 P[L_1 \lesssim 1]\, P[L_1 \gtrsim 1]. \tag{57}$$

As

$$1 = P[L_1 L_2 \lesssim 1] + P[L_1 L_2 \gtrsim 1] \tag{58}$$

$$= \left(P[L_1 \lesssim 1] + P[L_1 \gtrsim 1]\right)^2 \tag{59}$$

$$= P[L_1 \lesssim 1]^2 + P[L_1 \gtrsim 1]^2 + 2 P[L_1 \lesssim 1]\, P[L_1 \gtrsim 1] \tag{60}$$

must hold, we get

$$P[L_1 L_2 \gtrsim 1] = P[L_1 \gtrsim 1]^2 + \sum_{\ell_1 \gtrsim 1} \sum_{\ell_2 \gtrsim 1} P[L_1 = \ell_1] P[L_2 = \ell_2] \min\{\ell_1, \ell_2\}. \tag{61}$$

Therefore, (53) and (61) prove that

$$P[L_1 L_2 \lesssim 1] \ge P[L_1 L_2 \gtrsim 1] \tag{62}$$

holds as claimed. For the minus transformation, we have

$$P\!\left[\frac{L_1 + L_2}{1 + L_1 L_2} \lesssim 1\right] = P[L_1 \lesssim 1]^2 + P[L_1 \gtrsim 1]^2, \tag{63}$$

$$P\!\left[\frac{L_1 + L_2}{1 + L_1 L_2} \gtrsim 1\right] = 2 P[L_1 \lesssim 1]\, P[L_1 \gtrsim 1]. \tag{64}$$

By noting that the difference of these equals

$$\left(P[L_1 \lesssim 1] - P[L_1 \gtrsim 1]\right)^2 \ge 0, \tag{65}$$

the claim for the minus transformation is proved. ■


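Proof of Proposition 7: First note that for symmetric B-DMCs $W$
and $V$ symmetrized under the same permutation, we have, for $j \in \{2i-1, 2i\}$,

$$\mathrm{Pe}_{2N}^{(j)}(W, V) - \mathrm{Pe}_{2N}^{(j)}(V) = \sum_{y_1^N} \left[W(y_1^N \mid 0_1^N) - V(y_1^N \mid 0_1^N)\right] \sum_{y_{N+1}^{2N}} \left[W(y_{N+1}^{2N} \mid 0_1^N) + V(y_{N+1}^{2N} \mid 0_1^N)\right] H\!\left(L_{V_{2N}^{(j)}}(y_1^{2N})\right), \tag{66}$$

which can be proved similarly to Proposition 4. For simplicity
we define $L_{V_N^{(i)}}(y_1^N) = L_1$, $L_{V_N^{(i)}}(y_{N+1}^{2N}) = L_2$. First
observe that

$$H\!\left(\frac{L_1 + L_2}{1 + L_1 L_2}\right) = \begin{cases} \frac{1}{2}, & \text{if } L_1 = 1 \text{ or } L_2 = 1 \\ 1, & \text{if } L_1 < 1 \text{ and } L_2 > 1, \text{ or } L_1 > 1 \text{ and } L_2 < 1 \\ 0, & \text{if } L_1 < 1 \text{ and } L_2 < 1, \text{ or } L_1 > 1 \text{ and } L_2 > 1 \end{cases} \tag{67}$$

Then, we have

$$\mathrm{Pe}_{2N}^{(2i-1)}(W, V) - \mathrm{Pe}_{2N}^{(2i-1)}(V) = \sum_{y_1^N:\, L_1 = 1} \left[W(y_1^N \mid 0_1^N) - V(y_1^N \mid 0_1^N)\right] \times 1$$
$$+ \sum_{y_1^N:\, L_1 > 1} \left[W(y_1^N \mid 0_1^N) - V(y_1^N \mid 0_1^N)\right] \times \left( \sum_{y_{N+1}^{2N}:\, L_2 < 1} \left[W(y_{N+1}^{2N} \mid 0_1^N) + V(y_{N+1}^{2N} \mid 0_1^N)\right] + \frac{1}{2} \sum_{y_{N+1}^{2N}:\, L_2 = 1} \left[W(y_{N+1}^{2N} \mid 0_1^N) + V(y_{N+1}^{2N} \mid 0_1^N)\right] \right)$$
$$+ \sum_{y_1^N:\, L_1 < 1} \left[W(y_1^N \mid 0_1^N) - V(y_1^N \mid 0_1^N)\right] \times \left( \sum_{y_{N+1}^{2N}:\, L_2 > 1} \left[W(y_{N+1}^{2N} \mid 0_1^N) + V(y_{N+1}^{2N} \mid 0_1^N)\right] + \frac{1}{2} \sum_{y_{N+1}^{2N}:\, L_2 = 1} \left[W(y_{N+1}^{2N} \mid 0_1^N) + V(y_{N+1}^{2N} \mid 0_1^N)\right] \right) \tag{68}$$

$$= \sum_{y_1^N:\, L_1 = 1} \left[W(y_1^N \mid 0_1^N) - V(y_1^N \mid 0_1^N)\right] \times 1 + \sum_{y_1^N:\, L_1 > 1} \left[W(y_1^N \mid 0_1^N) - V(y_1^N \mid 0_1^N)\right] \times \sum_{y_{N+1}^{2N}} \left[W(y_{N+1}^{2N} \mid 0_1^N) + V(y_{N+1}^{2N} \mid 0_1^N)\right] \bar{H}(L_2)$$
$$+ \sum_{y_1^N:\, L_1 < 1} \left[W(y_1^N \mid 0_1^N) - V(y_1^N \mid 0_1^N)\right] \times \sum_{y_{N+1}^{2N}} \left[W(y_{N+1}^{2N} \mid 0_1^N) + V(y_{N+1}^{2N} \mid 0_1^N)\right] H(L_2), \tag{69}$$

where the "complement" function of $H$ is defined as

$$\bar{H}(L_1) \triangleq \mathbb{1}\{L_1 < 1\} + \frac{1}{2} \mathbb{1}\{L_1 = 1\}. \tag{70}$$

By substituting $\bar{H}(L_2) = 1 - H(L_2)$ and regrouping the
terms, we obtain

$$\mathrm{Pe}_{2N}^{(2i-1)}(W, V) - \mathrm{Pe}_{2N}^{(2i-1)}(V) = \sum_{y_1^N} \left[W(y_1^N \mid 0_1^N) - V(y_1^N \mid 0_1^N)\right] 2 H(L_1)$$
$$+ \sum_{y_1^N} \left[W(y_1^N \mid 0_1^N) - V(y_1^N \mid 0_1^N)\right] \left[1 - 2 H(L_1)\right] \times \sum_{y_{N+1}^{2N}} \left[W(y_{N+1}^{2N} \mid 0_1^N) + V(y_{N+1}^{2N} \mid 0_1^N)\right] H(L_2), \tag{71}$$

where we used the fact that $1 - 2H(L_1) = \mathbb{1}\{L_1 < 1\} - \mathbb{1}\{L_1 > 1\}$.
Now, note that the term in the second summation
with the 1 sums to 0. Hence, we get

$$\mathrm{Pe}_{2N}^{(2i-1)}(W, V) - \mathrm{Pe}_{2N}^{(2i-1)}(V) = \sum_{y_1^N} \left[W(y_1^N \mid 0_1^N) - V(y_1^N \mid 0_1^N)\right] H(L_1) \times \left( 2 - \sum_{y_{N+1}^{2N}} \left[W(y_{N+1}^{2N} \mid 0_1^N) + V(y_{N+1}^{2N} \mid 0_1^N)\right] 2 H(L_2) \right)$$
$$= \sum_{y_1^N} \left[W(y_1^N \mid 0_1^N) - V(y_1^N \mid 0_1^N)\right] H(L_1) \times \sum_{y_{N+1}^{2N}} \left[W(y_{N+1}^{2N} \mid 0_1^N) + V(y_{N+1}^{2N} \mid 0_1^N)\right] \left[1 - 2 H(L_2)\right]. \tag{72}$$

We recover Equation (31) upon noticing that $K_N$ defined in (32)
equals

$$\sum_{y_{N+1}^{2N}} \left[W(y_{N+1}^{2N} \mid 0_1^N) + V(y_{N+1}^{2N} \mid 0_1^N)\right] \left[1 - 2 H(L_2)\right] \tag{73}$$

as $1 - 2H(L_2) = \mathbb{1}\{L_2 < 1\} - \mathbb{1}\{L_2 > 1\}$. This proves
the claim for the minus transformation. The claim for the
plus transformation can be obtained directly from the expression
given in (66). ■

Proof of Proposition 8: Writing $L = L_{V_N^{(i)}}$, we have

$$P_W[L(y_1^N) > 1] + \frac{1}{2} P_W[L(y_1^N) = 1] - P_V[L(y_1^N) > 1] - \frac{1}{2} P_V[L(y_1^N) = 1]$$
$$= P_V[L(y_1^N) < 1] + \frac{1}{2} P_V[L(y_1^N) = 1] - P_W[L(y_1^N) < 1] - \frac{1}{2} P_W[L(y_1^N) = 1] \le 0, \tag{74}$$

where the negativity follows by condition B. Therefore, adding
both sides gives

$$P_W[L(y_1^N) > 1] - P_V[L(y_1^N) > 1] + P_V[L(y_1^N) < 1] - P_W[L(y_1^N) < 1] \le 0. \tag{75}$$

Hence,

$$P_W[L(y_1^N) < 1] - P_W[L(y_1^N) > 1] \ge P_V[L(y_1^N) < 1] - P_V[L(y_1^N) > 1] \ge 0, \tag{76}$$

where the non-negativity follows by condition A. ■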



